Automatic import from CLDR #1107
@c960657 I appreciate your effort here, but I'm not inclined to accept your PR. It will change many i18n files and may break the overall stability. I'm not confident that the changes introduced are accurate. Based on the pt.yml modifications, I can tell you that in Portugal, we don't use the full stop in the month and day abbreviations. The proposed changes for units are incorrect, as you can see in this commit: 0b8193b. Also, after a quick once-over, I see spaces after commas being removed and other situations that raise my concerns.
You are right; these changes cannot be committed in bulk. However, I still think the overall idea has some merit. The tool may be useful for translators to point out typos and suggest alternatives. Any changes can be submitted as separate PRs. Below are some cherry-picked examples that look like errors at a quick glance (they need further investigation). These are the kinds of issues that this tool will help translators find.
diff --git a/rails/locale/af.yml b/rails/locale/af.yml
index 6529b9a..e91f3e9 100644
--- a/rails/locale/af.yml
+++ b/rails/locale/af.yml
@@ -48,7 +48,7 @@ af:
- Februarie
- Maart
- April
- - Mai
+ - Mei
- Junie
- Julie
- Augustus
@@ -100,7 +100,7 @@ af:
one: "%{count} jaar"
other: "%{count} jare"
prompts:
- second: Sekondes
+ second: Sekonde
minute: Minuut
hour: Uur
day: Dag
diff --git a/rails/locale/el.yml b/rails/locale/el.yml
index 448000f..4de13c6 100644
--- a/rails/locale/el.yml
+++ b/rails/locale/el.yml
@@ -22,7 +22,7 @@ el:
- Φεβ
- Μαρ
- Απρ
- - Μαϊ
+ - Μαΐ
- Ιουν
- Ιουλ
- Αυγ
diff --git a/rails/locale/fr.yml b/rails/locale/fr.yml
index fbdb8c5..6efc02a 100644
--- a/rails/locale/fr.yml
+++ b/rails/locale/fr.yml
@@ -177,11 +177,19 @@ fr:
decimal_units:
format: "%n %u"
units:
- billion: milliard
- million: million
+ billion:
+ one: milliard
+ other: milliards
+ million:
+ one: million
+ other: millions
quadrillion: million de milliards
- thousand: millier
- trillion: billion
+ thousand:
+ one: millier
+ other: mille
+ trillion:
+ one: billion
+ other: billions
unit: ''
format:
delimiter: ''
diff --git a/rails/locale/it.yml b/rails/locale/it.yml
index c0066a7..e0a0584 100644
--- a/rails/locale/it.yml
+++ b/rails/locale/it.yml
@@ -100,7 +100,7 @@ it:
one: "%{count} anno"
other: "%{count} anni"
prompts:
- second: Secondi
+ second: Secondo
minute: Minuto
hour: Ora
day: Giorno
The tool would also be useful for adding initial translations of new strings, assuming that a translation imported directly from CLDR is better than no translation at all.
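As a rough illustration of the comparison idea discussed here, the following is a minimal sketch in plain Ruby, not the code from this PR. It flattens two locale hashes and lists the keys whose values differ, plus the keys that exist only in the CLDR-derived data. The helper names `flatten_keys` and `compare_locales` are hypothetical, and the sample data is a small excerpt modeled on the `af` diff above (the `hour` key is omitted locally only for the sake of the example).

```ruby
require "yaml"

# Flatten a nested hash into { "a.b.c" => value }; arrays stay as leaf values.
def flatten_keys(hash, prefix = nil)
  hash.each_with_object({}) do |(key, value), flat|
    path = [prefix, key].compact.join(".")
    if value.is_a?(Hash)
      flat.merge!(flatten_keys(value, path))
    else
      flat[path] = value
    end
  end
end

# Report keys whose values differ and keys present only in the CLDR data.
def compare_locales(current, cldr)
  ours = flatten_keys(current)
  theirs = flatten_keys(cldr)
  differing = (ours.keys & theirs.keys).select { |key| ours[key] != theirs[key] }
  {
    changed: differing.to_h { |key| [key, [ours[key], theirs[key]]] },
    missing: theirs.keys - ours.keys
  }
end

# Sample data (assumption: a tiny excerpt, not the real files).
current = YAML.safe_load(<<~YAML)
  af:
    datetime:
      prompts:
        second: Sekondes
        minute: Minuut
YAML

cldr = YAML.safe_load(<<~YAML)
  af:
    datetime:
      prompts:
        second: Sekonde
        minute: Minuut
        hour: Uur
YAML

report = compare_locales(current, cldr)
report[:changed].each { |key, (old, new)| puts "#{key}: #{old.inspect} -> #{new.inspect}" }
report[:missing].each { |key| puts "#{key}: missing locally" }
```

A report like this only flags candidates; each difference would still need review by someone who knows the language before being submitted as its own PR.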
I don't know anything about Portuguese, but I can see that Google Sheets as well as the native Calendar app on iPhone use a period in abbreviations in Portuguese, so I would assume that variant is not completely wrong? But in any case, some of these strings, date formats in particular, exist in several valid, widespread variants, and for stability reasons I agree we should not change them unnecessarily.
This problem occurs, because Portugal Portuguese has the language code |
Your code's merit isn't in question. This is more about aligning with the project's philosophy and objectives. We usually accept contributions as they are, unless someone (including maintainers) raises concerns. Concerns can lead to further discussion or to a request for input from native speakers. This PR suggests that CLDR translations are preferable to our existing ones and proposes a bulk update, which is why I won't merge it. You're welcome to submit individual PRs for each translation change. In cases where I'm unsure, I'll ask for feedback from native speakers. However, if you're considering introducing a new tool that compares with the CLDR project's suggestions, I wouldn't oppose its integration. This tool would ideally offer options to:
It would be a nice tool to have. About the full stop, in Portuguese, abbreviating words usually involves adding a full stop. However, there is at least one exception: when the abbreviation is the last word in a sentence. Unfortunately, Rails i18n doesn't support these kinds of exceptions. IMHO, it's better to omit the full stop in the Portuguese translations and let users manage these specific cases themselves. Translations are not easy, and we can't always approach them by strictly following language rules or international standards. This is especially true in a project like ours that is used in unknown contexts, and we want to keep it flexible.
I suggest we fetch translations from CLDR when possible. This reduces the maintenance burden and ensures generally high translation quality, including for locales with few or no active contributors.
This PR adds an automatic importer based on ruby-cldr. The PR implements the lowest-hanging fruit. It may be possible to fetch even more data from CLDR, but that requires some further fiddling, e.g. with date format patterns.
Obviously, this will introduce some changes to existing translations. These changes mainly fall into these categories:
The native `en` locale in Rails defines some strings in title case, e.g. “Million” and “Bytes”, even though these are often used within a sentence, i.e. not as standalone labels. I would personally add these strings in their natural case, but for now I have converted them using `upcase_first` to be consistent with Rails. However, this changes the locales that have chosen to deviate from the upper-casing in `en`. In either case, the convention should be consistent across all locales.
I hope these changes are not too controversial. Otherwise, let's discuss how we can adjust the import logic.
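To make the casing conversion concrete, here is a minimal sketch of what applying `upcase_first` to imported unit names might look like. It is not the importer's actual code: the `upcase_first` method below is a plain-Ruby stand-in for ActiveSupport's `String#upcase_first`, so that the example runs without Rails, and the `cldr_units` hash is a hypothetical sample using the French unit names from the diff above.

```ruby
# Plain-Ruby stand-in for ActiveSupport's String#upcase_first:
# upper-case only the first character and leave the rest untouched.
def upcase_first(string)
  return string if string.empty?

  string[0].upcase + string[1..-1]
end

# Hypothetical unit names in their natural (sentence) case.
cldr_units = {
  "million"  => "million",
  "billion"  => "milliard",
  "thousand" => "millier"
}

# Convert every value to match the convention of the Rails `en` locale.
rails_style = cldr_units.transform_values { |name| upcase_first(name) }

rails_style.each { |key, name| puts "#{key}: #{name}" }
```

Because only the first character changes, multi-word names keep their internal casing, which is the difference from `capitalize`.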
How to
To build all files, run these commands:
If you only want to update certain strings, use `tree-filter`:
`i18n-tasks` doesn't seem to handle locales stored in a subdirectory well. It creates duplicates of the files in `rails/locale/iso-639-2` in `rails/locale`. Delete these manually.
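The manual cleanup could be assisted by a small script. The following is only a sketch under assumptions: `duplicated_locale_files` is a hypothetical helper, not part of this PR or of `i18n-tasks`, and the demo runs in a temporary directory with made-up file names (`abc.yml`) that mimics the layout described above. It treats any `.yml` file in the parent directory whose name also exists in the subdirectory as a stray copy.

```ruby
require "fileutils"
require "tmpdir"

# Return paths in parent_dir whose file name also exists in sub_dir.
def duplicated_locale_files(parent_dir, sub_dir)
  sub_names = Dir.children(sub_dir).select { |name| name.end_with?(".yml") }
  sub_names.map { |name| File.join(parent_dir, name) }
           .select { |path| File.file?(path) }
           .sort
end

# Demo in a throwaway directory; the file names are invented for illustration.
duplicate_names = Dir.mktmpdir do |root|
  parent = File.join(root, "rails", "locale")
  sub = File.join(parent, "iso-639-2")
  FileUtils.mkdir_p(sub)
  FileUtils.touch(File.join(parent, "en.yml"))   # regular locale, no duplicate
  FileUtils.touch(File.join(sub, "abc.yml"))     # locale kept in the subdirectory
  FileUtils.touch(File.join(parent, "abc.yml"))  # stray copy to be reported

  duplicates = duplicated_locale_files(parent, sub)
  # FileUtils.rm(duplicates) would delete them; listing first is the safer default.
  duplicates.map { |path| File.basename(path) }
end

puts duplicate_names
```

Matching by file name alone assumes that no locale legitimately exists in both directories; if that assumption does not hold, compare file contents before deleting anything.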